feasible plan
NAMO-LLM: Efficient Navigation Among Movable Obstacles with Large Language Model Guidance
Zhang, Yuqing, Kantaros, Yiannis
Several planners have been proposed to compute robot paths that reach desired goal regions while avoiding obstacles. However, these methods fail when all pathways to the goal are blocked. In such cases, the robot must reason about how to reconfigure the environment to access task-relevant regions - a problem known as Navigation Among Movable Objects (NAMO). While various solutions to this problem have been developed, they often struggle to scale to highly cluttered environments. To address this, we propose NAMO-LLM, a sampling-based planner that searches over robot and obstacle configurations to compute feasible plans specifying which obstacles to move, where, and in what order. Its key novelty is a non-uniform sampling strategy guided by Large Language Models (LLMs) biasing the tree construction toward directions more likely to yield a solution. We show that NAMO-LLM is probabilistically complete and demonstrate through experiments that it efficiently scales to cluttered environments, outperforming related works in both runtime and plan quality.
MPCC: A Novel Benchmark for Multimodal Planning with Complex Constraints in Multimodal Large Language Models
Ji, Yiyan, Chen, Haoran, Chen, Qiguang, Wu, Chengyue, Qin, Libo, Che, Wanxiang
Multimodal planning capabilities refer to the ability to predict, reason, and design steps for task execution with multimodal context, which is essential for complex reasoning and decision-making across multiple steps. However, current benchmarks face two key challenges: (1) they cannot directly assess multimodal real-world planning capabilities, and (2) they lack constraints or implicit constraints across modalities. To address these issues, we introduce Multimodal Planning with Complex Constraints (MPCC), the first benchmark to systematically evaluate MLLMs' ability to handle multimodal constraints in planning. To address the first challenge, MPCC focuses on three real-world tasks: Flight Planning, Calendar Planning, and Meeting Planning. To solve the second challenge, we introduce complex constraints (e.g. budget, temporal, and spatial) in these tasks, with graded difficulty levels (EASY, MEDIUM, HARD) to separate constraint complexity from search space expansion. Experiments on 13 advanced MLLMs reveal significant challenges: closed-source models achieve only 21.3% feasible plans, while open-source models average below 11%. Additionally, we observe that MLLMs are highly sensitive to constraint complexity and that traditional multimodal prompting strategies fail in multi-constraint scenarios. Our work formalizes multimodal constraints in planning, provides a rigorous evaluation framework, and highlights the need for advancements in constraint-aware reasoning for real-world MLLM applications.
A Temporal Planning Framework for Multi-Agent Systems via LLM-Aided Knowledge Base Management
Saccon, Enrico, Tikna, Ahmet, De Martini, Davide, Lamon, Edoardo, Palopoli, Luigi, Roveri, Marco
This paper presents a novel framework, called PLANTOR (PLanning with Natural language for Task-Oriented Robots), that integrates Large Language Models (LLMs) with Prolog-based knowledge management and planning for multi-robot tasks. The system employs a two-phase generation of a robot-oriented knowledge base, ensuring reusability and compositional reasoning, as well as a three-step planning procedure that handles temporal dependencies, resource constraints, and parallel task execution via mixed-integer linear programming. The final plan is converted into a Behaviour Tree for direct use in ROS2. We tested the framework in multi-robot assembly tasks within a block world and an arch-building scenario. Results demonstrate that LLMs can produce accurate knowledge bases with modest human feedback, while Prolog guarantees formal correctness and explainability. This approach underscores the potential of LLM integration for advanced robotics tasks requiring flexible, scalable, and human-understandable planning.
Zero-shot Robotic Manipulation with Language-guided Instruction and Formal Task Planning
Tang, Junfeng, Ye, Zihan, Yan, Yuping, Zheng, Ziqi, Gao, Ting, Jin, Yaochu
Robotic manipulation is often challenging due to the long-horizon tasks and the complex object relationships. A common solution is to develop a task and motion planning framework that integrates planning for high-level task and low-level motion. Recently, inspired by the powerful reasoning ability of Large Language Models (LLMs), LLM-based planning approaches have achieved remarkable progress. However, these methods still heavily rely on expert-specific knowledge, often generating invalid plans for unseen and unfamiliar tasks. To address this issue, we propose an innovative language-guided symbolic task planning (LM-SymOpt) framework with optimization. It is the first expert-free planning framework since we combine the world knowledge from LLMs with formal reasoning, resulting in improved generalization capability to new tasks. Specifically, differ to most existing work, our LM-SymOpt employs LLMs to translate natural language instructions into symbolic representations, thereby representing actions as high-level symbols and reducing the search space for planning. Next, after evaluating the action probability of completing the task using LLMs, a weighted random sampling method is introduced to generate candidate plans. Their feasibility is assessed through symbolic reasoning and their cost efficiency is then evaluated using trajectory optimization for selecting the optimal planning. Our experimental results show that LM-SymOpt outperforms existing LLM-based planning approaches.
Multiagent Connected Path Planning: PSPACE-Completeness and How to Deal With It
Tateo, Davide (Politecnico di Milano) | Banfi, Jacopo (Politecnico di Milano) | Riva, Alessandro (Politecnico di Milano) | Amigoni, Francesco (Politecnico di Milano) | Bonarini, Andrea (Politecnico di Milano)
In the Multiagent Connected Path Planning problem (MCPP), a team of agents moving in a graph-represented environment must plan a set of start-goal joint paths which ensures global connectivity at each time step, under some communication model. The decision version of this problem asking for the existence of a plan that can be executed in at most a given number of steps is claimed to be NP-complete in the literature. The NP membership proof, however, is not detailed. In this paper, we show that, in fact, even deciding whether a feasible plan exists is a PSPACE-complete problem. Furthermore, we present three algorithms adopting different search paradigms, and we empirically show that they may efficiently obtain a feasible plan, if any exists, in different settings.
On a Practical, Integer-Linear Programming Model for Delete-Free Tasks and its Use as a Heuristic for Cost-Optimal Planning
We propose a new integer-linear programming model for the delete relaxation in cost-optimal planning. While a straightforward IP for the delete relaxation is impractical, our enhanced model incorporates variable reduction techniques based on landmarks, relevance-based constraints, dominated action elimination, immediate action application, and inverse action constraints, resulting in an IP that can be used to directly solve delete-free planning problems. We show that our IP model is competitive with previous state-of-the-art solvers for delete-free problems. The LP-relaxation of the IP model is often a very good approximation to the IP, providing an approach to approximating the optimal value of the delete-free task that is complementary to the well-known LM-cut heuristic. We also show that constraints that partially consider delete effects can be added to our IP/LP models. We embed the new IP/LP models into a forward-search based planner, and show that the performance of the resulting planner on standard IPC benchmarks is comparable with the state-of-the-art for cost-optimal planning.
Integration of 3D Object Recognition and Planning for Robotic Manipulation: A Preliminary Report
Duff, Damien Jade, Erdem, Esra, Patoglu, Volkan
We investigate different approaches to integrating object recognition and planning in a tabletop manipulation domain with the set of objects used in the 2012 RoboCup@Work competition. Results of our preliminary experiments show that, with some approaches, close integration of perception and planning improves the quality of plans, as well as the computation times of feasible plans.
Levels of Integration between Low-Level Reasoning and Task Planning
Erdem, Esra, Patoglu, Volkan, Schüller, Peter
We provide a systematic analysis of levels of integration between discrete high-level reasoning and continuous low-level reasoning to address hybrid planning problems in robotics. We identify four distinct strategies for such an integration: (i) low-level checks are done for all possible cases in advance and then this information is used during plan generation, (ii) low-level checks are done exactly when they are needed during the search for a plan, (iii) first all plans are computed and then infeasible ones are filtered, and (iv) by means of replanning, after finding a plan, low-level checks identify whether it is infeasible or not; if it is infeasible, a new plan is computed considering the results of previous lowlevel checks. We perform experiments on hybrid planning problems in robotic manipulation and legged locomotion domains considering these four methods of integration, as well as some of their combinations. We analyze the usefulness of levels of integration in these domains, both from the point of view of computational efficiency (in time and space) and from the point of view of plan quality relative to its feasibility. We discuss advantages and disadvantages of each strategy in the light of experimental results and provide some guidelines on choosing proper strategies for a given domain. Keywords: Task planning, geometric reasoning, answer set programming.